333 research outputs found

    A tool box for compiler construction


    Introduction to protein folding for physicists

    The prediction of the three-dimensional native structure of proteins from the knowledge of their amino acid sequence, known as the protein folding problem, is one of the most important yet unsolved issues of modern science. Since the conformational behaviour of flexible molecules is nothing more than a complex physical problem, an increasing number of physicists are moving into the study of protein systems, bringing with them powerful mathematical and computational tools, as well as the sharp intuition and deep images inherent to the physics discipline. This work attempts to facilitate the first steps of such a transition. To this end, we provide an exhaustive account of the reasons underlying the enormous relevance of the protein folding problem and summarize the present-day status of the methods aimed at solving it. We also provide an introduction to the particular structure of these biological heteropolymers, and we physically define the problem, stating the assumptions behind this (commonly implicit) definition. Finally, we review the 'special flavor' of statistical mechanics that is typically used to study the astronomically large phase spaces of macromolecules. Throughout the work, much material that is found scattered in the literature has been put together to improve comprehension and to serve as a handy reference. Comment: 53 pages, 18 figures, the figures are at a low resolution due to arXiv restrictions, for high-res figures, go to http://www.pabloechenique.co

    Flavor Singlet Meson Mass in the Continuum Limit in Two-Flavor Lattice QCD

    We present results for the mass of the eta-prime meson in the continuum limit for two-flavor lattice QCD, calculated on the CP-PACS computer, using a renormalization-group improved gauge action and Sheikholeslami and Wohlert's fermion action with tadpole-improved c_SW. Correlation functions are measured at three values of the coupling constant β, corresponding to lattice spacings a ≈ 0.22, 0.16 and 0.11 fm, and for four values of the quark mass parameter κ, corresponding to m_π/m_ρ ≈ 0.8, 0.75, 0.7 and 0.6. For each (β, κ) pair, 400-800 gauge configurations are used. The two-loop diagrams are evaluated using a noisy source method. We calculate eta-prime propagators using local sources, and find that excited state contributions are much reduced by smearing. A full analysis for the smeared propagators gives m_η' = 0.960(87) +0.036/−0.248 GeV in the continuum limit, where the second error represents the systematic uncertainty coming from varying the functional form for chiral and continuum extrapolations. Comment: 9 pages, 19 figures, 4 tables

    Knowledge-based biomedical word sense disambiguation: comparison of approaches

    Background: Word sense disambiguation (WSD) algorithms attempt to select the proper sense of ambiguous terms in text. Resources like the UMLS provide a reference thesaurus to be used to annotate the biomedical literature. Statistical learning approaches have produced good results, but the size of the UMLS makes it infeasible to produce training data that covers the whole domain. Methods: We present research on existing WSD approaches based on knowledge bases, which complement the studies performed on statistical learning. We compare four approaches which rely on the UMLS Metathesaurus as the source of knowledge. The first approach compares the overlap of the context of the ambiguous word to the candidate senses based on a representation built out of the definitions, synonyms and related terms. The second approach collects training data for each of the candidate senses to perform WSD based on queries built using monosemous synonyms and related terms. These queries are used to retrieve MEDLINE citations. Then, a machine learning approach is trained on this corpus. The third approach is a graph-based method which exploits the structure of the Metathesaurus network of relations to perform unsupervised WSD. This approach ranks nodes in the graph according to their relative structural importance. The last approach uses the semantic types assigned to the concepts in the Metathesaurus to perform WSD. The context of the ambiguous word and semantic types of the candidate concepts are mapped to Journal Descriptors. These mappings are compared to decide among the candidate concepts. Results are provided estimating accuracy of the different methods on the WSD test collection available from the NLM. Conclusions: We have found that the last approach achieves better results compared to the other methods. The graph-based approach, using the structure of the Metathesaurus network to estimate the relevance of the Metathesaurus concepts, does not perform well compared to the first two methods. In addition, the combination of methods improves the performance over the individual approaches. On the other hand, the performance is still below statistical learning trained on manually produced data and below the maximum frequency sense baseline. Finally, we propose several directions to improve the existing methods and to improve the Metathesaurus to be more effective in WSD.
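The first approach described in the abstract above, scoring each candidate sense by how much its representation overlaps with the context of the ambiguous word, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the plain token-count score, and the toy sense profiles are all assumptions made for illustration, and no real UMLS content is used.

```python
import re


def tokenize(text):
    """Lowercase a string and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def overlap_disambiguate(context, sense_profiles):
    """Pick the sense whose profile shares the most tokens with the context.

    sense_profiles maps a sense label to text standing in for its
    definitions, synonyms and related terms (hypothetical toy data,
    not taken from the UMLS Metathesaurus).
    """
    context_tokens = tokenize(context)
    scores = {
        sense: len(context_tokens & tokenize(profile))
        for sense, profile in sense_profiles.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy example: two invented senses of the ambiguous word "cold".
profiles = {
    "cold_temperature": "low temperature weather freezing chill",
    "common_cold": "viral infection nose throat cough symptoms",
}
best_sense, scores = overlap_disambiguate(
    "The patient reported a cough and sore throat from a cold.", profiles
)
print(best_sense, scores)
```

A realistic system would at least remove stop words and weight terms, which this sketch leaves out to keep the overlap idea visible.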

    Evolution favors protein mutational robustness in sufficiently large populations

    BACKGROUND: An important question is whether evolution favors properties such as mutational robustness or evolvability that do not directly benefit any individual, but can influence the course of future evolution. Functionally similar proteins can differ substantially in their robustness to mutations and capacity to evolve new functions, but it has remained unclear whether any of these differences might be due to evolutionary selection for these properties. RESULTS: Here we use laboratory experiments to demonstrate that evolution favors protein mutational robustness if the evolving population is sufficiently large. We neutrally evolve cytochrome P450 proteins under identical selection pressures and mutation rates in populations of different sizes, and show that proteins from the larger and thus more polymorphic population tend towards higher mutational robustness. Proteins from the larger population also evolve greater stability, a biophysical property that is known to enhance both mutational robustness and evolvability. The excess mutational robustness and stability is well described by existing mathematical theories, and can be quantitatively related to the way that the proteins occupy their neutral network. CONCLUSIONS: Our work is the first experimental demonstration of the general tendency of evolution to favor mutational robustness and protein stability in highly polymorphic populations. We suggest that this phenomenon may contribute to the mutational robustness and evolvability of viruses and bacteria that exist in large populations

    Robust probabilistic superposition and comparison of protein structures

    Background: Protein structure comparison is a central issue in structural bioinformatics. The standard dissimilarity measure for protein structures is the root mean square deviation (RMSD) of representative atom positions such as α-carbons. To evaluate the RMSD the structures under comparison must be superimposed optimally so as to minimize the RMSD. How to evaluate optimal fits becomes a matter of debate if the structures contain regions which differ largely, a situation encountered in NMR ensembles and proteins undergoing large-scale conformational transitions. Results: We present a probabilistic method for robust superposition and comparison of protein structures. Our method aims to identify the largest structurally invariant core. To do so, we model non-rigid displacements in protein structures with outlier-tolerant probability distributions. These distributions exhibit heavier tails than the Gaussian distribution underlying standard RMSD minimization and thus accommodate highly divergent structural regions. The drawback is that under a heavy-tailed model analytical expressions for the optimal superposition no longer exist. To circumvent this problem we work with a scale mixture representation, which implies a weighted RMSD. We develop two iterative procedures, an Expectation Maximization algorithm and a Gibbs sampler, to estimate the local weights, the optimal superposition, and the parameters of the heavy-tailed distribution. Applications demonstrate that heavy-tailed models capture differences between structures undergoing substantial conformational changes and can be used to assess the precision of NMR structures. By comparing Bayes factors we can automatically choose the most adequate model. Therefore our method is parameter-free. Conclusions: Heavy-tailed distributions are well-suited to describe large-scale conformational differences in protein structures. A scale mixture representation facilitates the fitting of these distributions and enables outlier-tolerant superposition.
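To make the link between a scale mixture and a weighted RMSD concrete, here is a small sketch of the weighting step alone. It assumes a Student's t model, which is one common heavy-tailed choice; the abstract does not say which distributions the authors use. The function names and toy numbers are hypothetical, and the rotation and translation fit that a full Expectation Maximization loop would alternate with this step is omitted.

```python
import math


def student_t_weights(sq_dists, sigma2, nu, dim=3):
    """Expected scale-mixture weights under a Student's t model.

    For squared displacement r2, scale sigma2 and nu degrees of freedom,
    the expected weight is (nu + dim) / (nu + r2 / sigma2), so atoms with
    large displacements are down-weighted.
    """
    return [(nu + dim) / (nu + r2 / sigma2) for r2 in sq_dists]


def weighted_rmsd(sq_dists, weights):
    """Square root of the weighted mean of the squared displacements."""
    total = sum(weights)
    return math.sqrt(sum(w * r2 for w, r2 in zip(weights, sq_dists)) / total)


# Toy squared displacements (in squared angstroms) for four atoms of two
# already superimposed structures; the last atom is a deliberate outlier.
sq_dists = [0.1, 0.2, 0.1, 25.0]
weights = student_t_weights(sq_dists, sigma2=0.2, nu=1.0)

plain = math.sqrt(sum(sq_dists) / len(sq_dists))
robust = weighted_rmsd(sq_dists, weights)
print(f"plain RMSD = {plain:.3f}, weighted RMSD = {robust:.3f}")
```

The outlier dominates the plain RMSD but receives a small weight, so the weighted value mostly reflects the well-matched atoms.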

    Using a virtual environment to assess cognition in the elderly

    Early diagnosis of Alzheimer’s disease (AD) is essential if treatments are to be administered at an earlier point in time, before neurons degenerate to a stage beyond repair. For early detection to occur, the tools used to detect the disorder must be sensitive to the earliest cognitive impairments. Virtual reality (VR) technology offers opportunities to provide products which attempt to mimic daily life situations, as much as is possible, within the computational environment. This may be useful for the detection of cognitive difficulties. We develop a virtual simulation designed to assess visuospatial memory in order to investigate cognitive function in a group of healthy elderly participants and those with a mild cognitive impairment (MCI). Participants were required to guide themselves along a virtual path to reach a virtual destination which they were required to remember. The preliminary results indicate that this virtual simulation has the potential to be used for detection of early AD, since significant correlations of scores on the virtual environment with existing neuropsychological tests were found. Furthermore, the test discriminated between healthy elderly participants and those with MCI.

    Oligomeric Structure of the MALT1 Tandem Ig-Like Domains

    Mucosa-associated lymphoid tissue 1 (MALT1) plays an important role in the adaptive immune program. During TCR- or BCR-induced NF-κB activation, MALT1 serves to mediate the activation of the IKK (IκB kinase) complex, which subsequently regulates the activation of NF-κB. Aggregation of MALT1 is important for E3 ligase activation and NF-κB signaling. Unlike the isolated CARD or paracaspase domains, which behave as monomers, the tandem Ig-like domains of MALT1 exist as a mixture of dimer and tetramer in solution. High-resolution structures reveal a protein-protein interface that is stabilized by a buried surface area of 1256 Å² and contains numerous hydrogen and salt bonds. In conjunction with a second interface, these interactions may represent the basis of MALT1 oligomerization. The crystal structure of the tandem Ig-like domains reveals the oligomerization potential of MALT1 and a potential intermediate in the activation of the adaptive inflammatory pathway. This article can also be viewed as an enhanced version in which the text of the article is integrated with interactive 3D representations and animated transitions. Please note that a web plugin is required to access this enhanced functionality. Instructions for the installation and use of the web plugin are available in Text S1.

    Climatic predictors of species distributions neglect biophysiologically meaningful variables

    This is the final version. Available on open access from Wiley via the DOI in this record. Aim: Species distribution models (SDMs) have played a pivotal role in predicting how species might respond to climate change. To generate reliable and realistic predictions from these models requires the use of climate variables that adequately capture physiological responses of species to climate and therefore provide a proximal link between climate and their distributions. Here, we examine whether the climate variables used in plant SDMs are different from those known to influence directly plant physiology. Location: Global. Methods: We carry out an extensive, systematic review of the climate variables used to model the distributions of plant species and provide comparison to the climate variables identified as important in the plant physiology literature. We calculate the top ten SDM and physiology variables at 2.5 degree spatial resolution for the globe and use principal component analyses and multiple regression to assess similarity between the climatic variation described by both variable sets. Results: We find that the most commonly used SDM variables do not reflect the most important physiological variables and differ in two main ways: (i) SDM variables rely on seasonal or annual rainfall as simple proxies of water available to plants and neglect more direct measures such as soil water content; and (ii) SDM variables are typically averaged across seasons or years and overlook the importance of climatic events within the critical growth period of plants. We identify notable differences in their spatial gradients globally and show where distal variables may be less reliable proxies for the variables to which species are known to respond. Main conclusions: There is a growing need for the development of accessible, fine-resolution global climate surfaces of physiological variables. 
This would provide a means to improve the reliability of future range predictions from SDMs and support efforts to conserve biodiversity in a changing climate